

Search for: All records

Creators/Authors contains: "Fraser, Katherine"


  1. Abstract Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and W) jets in a QCD background. We find that the hyperparameter choices strongly affect the network performance and that the optimal parameters for one signal are non-optimal for another. In exploring the networks, we uncover a connection between the latent space of a variational autoencoder trained using mean-squared-error and the optimal transport distances within the dataset. We then show that optimal transport distances to representative events in the background dataset can be used directly for anomaly detection, with performance comparable to the autoencoders. Whether using autoencoders or optimal transport distances for anomaly detection, we find that the choices that best represent the background are not necessarily best for signal identification. These challenges with unsupervised anomaly detection bolster the case for additional exploration of semi-supervised or alternative approaches.
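The optimal-transport scoring idea above can be illustrated with a minimal sketch: for one-dimensional samples of equal size, the Wasserstein-1 distance reduces to the average gap between sorted values, and an event can be scored by its minimum distance to a set of representative background events. All names, the Gaussian toy "events", and the min-distance scoring rule are illustrative assumptions, not the paper's actual setup (which uses jets and a QCD background).

```python
import numpy as np

def wasserstein_1d(x, y):
    """1D optimal transport (Wasserstein-1) distance between two
    equal-size empirical samples: the mean gap between sorted values."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    assert x.shape == y.shape, "this sketch assumes equal sample sizes"
    return float(np.mean(np.abs(x - y)))

def anomaly_score(event, reference_events):
    """Score an event by its minimum OT distance to representative
    background events (hypothetical scoring rule for illustration)."""
    return min(wasserstein_1d(event, ref) for ref in reference_events)

rng = np.random.default_rng(0)
background = [rng.normal(0.0, 1.0, 64) for _ in range(5)]  # background stand-in
qcd_like = rng.normal(0.0, 1.0, 64)   # event drawn from the background
signal_like = rng.normal(3.0, 1.0, 64)  # shifted "signal" stand-in

# A background-like event should score lower than a shifted one.
print(anomaly_score(qcd_like, background) < anomaly_score(signal_like, background))
```

The appeal of this approach, per the abstract, is that it needs no training: the score comes directly from distances within the dataset, with performance comparable to the autoencoders.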
  2. Abstract One of the key tasks of any particle collider is measurement. In practice, this is often done by fitting data to a simulation, which depends on many parameters. Sometimes, when the effects of varying different parameters are highly correlated, a large ensemble of data may be needed to resolve parameter-space degeneracies. An important example is measuring the top-quark mass, where other physical and unphysical parameters in the simulation must be profiled when fitting the top-quark mass parameter. We compare four different methodologies for top-quark mass measurement: a classical histogram fit similar to one commonly used in experiment augmented by soft-drop jet grooming; a 2D profile likelihood fit with a nuisance parameter; a machine-learning method called DCTR; and a linear regression approach, either using a least-squares fit or with a dense linearly-activated neural network. Despite the fact that individual events are totally uncorrelated, we find that the linear regression methods work most effectively when we input an ensemble of events sorted by mass, rather than training them on individual events. Although all methods provide robust extraction of the top-quark mass parameter, the linear network does marginally best and is remarkably simple. For the top study, we conclude that the Monte-Carlo-based uncertainty on current extractions of the top-quark mass from LHC data can be reduced significantly (by perhaps a factor of 2) using networks trained on sorted event ensembles. More generally, machine learning from ensembles for parameter estimation has broad potential for collider physics measurements.
  3. Konukman, Ferman (Ed.)
    Background Many schools have been cutting physical education (PE) classes due to budget constraints, which raises the question of whether policymakers should require schools to offer PE classes. Evidence suggests that PE classes can help address rising physical inactivity and obesity prevalence. However, it would be helpful to determine if requiring PE is cost-effective. Methods We developed an agent-based model of youth in Mexico City and the impact of all schools offering PE classes on changes in weight, weight-associated health conditions and the corresponding direct and indirect costs over their lifetime. Results If schools offered PE without meeting guidelines and instead followed currently observed class length and time active during class, overweight and obesity prevalence decreased by 1.3% (95% CI: 1.0%-1.6%) and the policy was cost-effective from the third-party payer and societal perspectives ($5,058 per disability-adjusted life year [DALY] averted and $5,786/DALY averted, respectively, assuming PE cost $50.3 million). When all schools offered PE classes meeting international guidelines for PE classes, overweight and obesity prevalence decreased by 3.9% (95% CI: 3.7%-4.3%) in the cohort at the end of five years compared to no PE. Long-term, this averted 3,183 and 1,081 obesity-related health conditions and deaths, respectively, and averted ≥$31.5 million in direct medical costs and ≥$39.7 million in societal costs, assuming PE classes cost ≤$50.3 million over the five-year period. PE classes could cost up to $185.5 million and $89.9 million over the course of five years and still remain cost-effective and cost saving, respectively, from the societal perspective. Conclusion Requiring PE in all schools could be cost-effective when PE classes cost, on average, up to $10,340 per school annually. Further, the amount of time students are active during class is a driver of PE classes' value (e.g., PE is cost saving when classes meet international guidelines), suggesting the need for specific recommendations.